iT邦幫忙

2022 iThome 鐵人賽

DAY 26
0
Software Development

NoSQL: Not Only SQL系列 第 26

[Day 26] Column Family Database:以 Cassandra 為例

  • 分享至 

  • xImage
  •  

建立本機環境

這次一樣使用 Docker 在本機建立相關環境。

Pull Image

取得 cassandra 的 image。

> docker pull cassandra

Run Image

將該 image run 起來,指定 port 1236 (或其他你想要的 port 也可以),並命名為 ironman-cassandra-demo。

> docker run -p 1236:9042 --name ironman-cassandra-demo cassandra

連線

連進該 container 並使用 cqlsh。

> docker exec -ti ironman-cassandra-demo cqlsh

預設會連進 Test Cluster

Connected to Test Cluster at 127.0.0.1:9042
[cqlsh 6.0.0 | Cassandra 4.0.6 | CQL spec 3.4.5 | Native protocol v5]
Use HELP for help.

基本操作

cql 的語法非常接近 sql,所以接下來這些看起來應該會很熟悉。

建立 Keyspace

透過 CREATE KEYSPACE 建立 keyspace,建立時要設定 replication 相關參數,可參考官方網站

因為是本機的練習環境,所以這邊使用 SimpleStrategy 並設定 replication_factor 為 1,但正式開發不應該這樣設定。

cqlsh> CREATE KEYSPACE demo_keyspace WITH replication = { 'class':'SimpleStrategy', 'replication_factor':1 };

接著透過 DESCRIBEDESC 可以看到該 keyspace 的資訊。

cqlsh> DESC KEYSPACE demo_keyspace;

得到結果:

CREATE KEYSPACE demo_keyspace WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '1'}  AND durable_writes = true;

使用 use 進入該 keyspace。

cqlsh> USE demo_keyspace;
cqlsh:demo_keyspace>

建立 Table

建立 Table 時需要設定,在 Cassandra 中一張張的 Table 就是一個個 Column Family,所以每張 Table 在建立時要先定義好 Row Key (即 Primary Key),Row Key 可以是單一欄位或複合欄位,其他的欄位可以一併定義,或之後再加都可以。

cqlsh:demo_keyspace> CREATE TABLE t1 ( a INT PRIMARY KEY, b INT, c INT);
cqlsh:demo_keyspace> CREATE TABLE t2 ( d INT PRIMARY KEY, e INT, f INT);
cqlsh:demo_keyspace> CREATE TABLE t3 ( g INT PRIMARY KEY, h INT, i INT);

可以透過 DESCRIBEDESC 查看剛剛定義的這些 Keyspace 和 Table。

cqlsh:demo_keyspace> DESC SCHEMA;

得到回傳如下:

CREATE KEYSPACE demo_keyspace WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '1'}  AND durable_writes = true;

CREATE TABLE demo_keyspace.t1 (
    a int PRIMARY KEY,
    b int,
    c int
) WITH additional_write_policy = '99p'
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND cdc = false
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '16', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND default_time_to_live = 0
    AND extensions = {}
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair = 'BLOCKING'
    AND speculative_retry = '99p';

CREATE TABLE demo_keyspace.t2 (
    d int PRIMARY KEY,
    e int,
    f int
) WITH additional_write_policy = '99p'
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND cdc = false
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '16', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND default_time_to_live = 0
    AND extensions = {}
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair = 'BLOCKING'
    AND speculative_retry = '99p';

CREATE TABLE demo_keyspace.t3 (
    g int PRIMARY KEY,
    h int,
    i int
) WITH additional_write_policy = '99p'
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND cdc = false
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '16', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND default_time_to_live = 0
    AND extensions = {}
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair = 'BLOCKING'
    AND speculative_retry = '99p';

新增資料

新增資料和 SQL 一樣使用 INSERT,語法如下:

cqlsh:demo_keyspace> INSERT INTO t1 (a,b,c) VALUES (1,2,3);
cqlsh:demo_keyspace> INSERT INTO t2 (d,e,f) VALUES (4,5,6);
cqlsh:demo_keyspace> INSERT INTO t3 (g,h,i) VALUES (7,8,NULL);

PK 為必填且不可為 NULL,沒填正確的值會回傳錯誤訊息。

cqlsh:demo_keyspace> INSERT INTO t1 (a,b,c) VALUES (NULL,2,3);
InvalidRequest: Error from server: code=2200 [Invalid query] message="Invalid null value in condition for column a"

若該 PK 已存在,會「更新」該筆資料,不會回傳 PK 衝突,這點需要注意。

cqlsh:demo_keyspace> INSERT INTO t1 (a,b,c) VALUES (1,2,3);
cqlsh:demo_keyspace> INSERT INTO t1 (a,b,c) VALUES (1,2,4);

查詢資料

使用 SELECT 對 Table 進行查詢。要注意的是,如果有下查詢條件(像是 WHERE),在沒建立索引的條件下,查詢條件一定要包含 PK 才可以進行查詢。

cqlsh:demo_keyspace> SELECT * FROM t1;

 a | b | c
---+---+---
 1 | 2 | 4

(1 rows)
cqlsh:demo_keyspace> SELECT * FROM t1 WHERE a = 0;

 a | b | c
---+---+---

(0 rows)

條件沒下 PK 會得到錯誤訊息:

cqlsh:demo_keyspace> SELECT * FROM t1 WHERE b = 0;
InvalidRequest: Error from server: code=2200 [Invalid query] message="Cannot execute this query as it might involve data filtering and thus may have unpredictable performance. If you want to execute this query despite the performance unpredictability, use ALLOW FILTERING"

整體而言 CQL 下起來跟 SQL 非常接近,熟悉 SQL 的人直接使用 CQL 應該沒什麼大問題。但由於 Cassandra 不是關聯式資料庫,務必注意其中的差異,詳細可見官方文件,基本上都有特別標註提醒。


上一篇
[Day 25] Column Family Database:以 Cassandra 為例
下一篇
[Day 27] Column Family Database:以 Cassandra 為例
系列文
NoSQL: Not Only SQL30
圖片
  直播研討會
圖片
{{ item.channelVendor }} {{ item.webinarstarted }} |
{{ formatDate(item.duration) }}
直播中

尚未有邦友留言

立即登入留言